Thickness bound for nonlocal wide-field-of-view metalenses

Metalenses—flat lenses made with optical metasurfaces—promise to enable thinner, cheaper, and better imaging systems. Achieving a sufficient angular field of view (FOV) is crucial toward that goal and requires a tailored incident-angle-dependent response. Here, we show that there is an intrinsic trade-off between achieving a desired broad-angle response and reducing the thickness of the device. Like the memory effect in disordered media, this thickness bound originates from the Fourier transform duality between space and angle. One can write down the transmission matrix describing the desired angle-dependent response, convert it to the spatial basis where its degree of nonlocality can be quantified through a lateral spreading, and determine the minimal device thickness based on such a required lateral spreading. This approach is general. When applied to wide-FOV lenses, it predicts the minimal thickness as a function of the FOV, lens diameter, and numerical aperture. The bound is tight, as some inverse-designed multi-layer metasurfaces can approach the minimal thickness we found. This work offers guidance for the design of nonlocal metasurfaces, proposes a new framework for establishing bounds, and reveals the relation between angular diversity and spatial footprint in multi-channel systems.


LATERAL SPREADING VERSUS THICKNESS
shows ∆W max of the same metasurfaces under TE-polarized incidence. Figure S2 shows ∆W max of single-layer metasurfaces designed to have the hyperbolic phase profile [1,2] at normal incidence, quadratic phase profile [3][4][5] at normal incidence, and random phase profiles without (n in = 1) and with substrate (n in = 1.5) at FOV = 180 • . We observe the same relation here: ∆W max < h. This relation also holds when different FOV is considered, as shown in Fig. S3.

A. Basis of transverse channels
Consider the TM fields of a 2D system, with E = E x (y, z)x, where the refractive index is n in and n out on the incident (z < 0) and transmitted (z > h) sides respectively, as shown in Fig. S4. The Poynting flux for a monochromatic wave at ω is S4. Schematic of the system: light incident from a medium with refractive index n in onto an entrance aperture with diameter D in , propagates through the structure with thickness h, and exits through an output aperture with diameter D out into a medium with refractive index n out .
We expand the incident field near the input aperture, E in x (y, z 0), in a basis of truncated plane waves: where where the summation only includes propagating channels (with real-valued k a z ) as the evanescent ones do not carry flux. Based on the Nyquist-Shannon sampling theorem [6], a discrete sampling of the momentum space with spacing 2π/D in is sufficient to represent a band-limited function with bandwidth D in in real space. Therefore, we consider input wave numbers limited to within the FOV of interest. Note that sin FOV 2 = n in sin FOV in 2 . With this choice, the different basis functions of Eq. (S3) are orthogonal, and the cross terms in Eq. (S4) drop out.
Similarly, the transmitted field E x (y, z h) near the output aperture can be expanded in a flux-orthogonal basis of truncated plane waves where Note that the transmitted field can also have evanescent components, but here we only consider contributions from the propagating ones.

B. Transmission matrix in angular basis
Consider incident wave from a fixed angle so v a = δ aa . Following Fig. 3 of the main text, the transmitted field for an ideal lens is which we can project onto the preceding basis {g b } to get Since u b = ∑ a t ba v a = t ba , this gives the a-th column of the transmission matrix in angular basis. To summarize, we have (S12) The normalization constant A(θ in ) can be determined from flux conservation, ∑ b |t ba | 2 = 1, for an ideal lens with unity transmission.
To evaluate the transmission matrix in practice, we approximate the continuous integration over y in Eq. (S12) and in the evaluation of the inverse participation ratio (IPR) [Eq. (7) of the main text] with a discrete summation with spacing ∆y. To determine the resolution to use, we use ∆y = λ/4 and ∆y = λ/20 to evaluate the maximal lateral spreading ∆W max for D out = 200λ or 900λ, NA = 0.2 or 0.8, FOV = 40 • or 160 • . The relative difference of ∆W max between the two choices of ∆y is smaller than 5% in all cases, so we use ∆y = λ/4 in the following.
Eq. (S12) approximated with a discrete summation can be evaluated efficiently using fast Fourier transform (FFT), where N ≡ D out /∆y ∈ Z is the number of segments the interval y ∈ [−D out /2, D out /2] is discretized into, and {y n ≡ − D out 2 + ∆y 2 + n∆y} are the centers of those segments. This produces N rows of the transmission matrix with 0 ≤ b ≤ N − 1; we cyclically rearrange the output index b and keep the propagating channels within |k b y | < n out ω/c.

C. Transmission matrix in spatial basis
We define the transmission matrix in spatial basis, t(y, y ), by where only propagating waves within the input and output apertures are kept in the incident field E in x (y , z = 0) and in the transmitted field E x (y, z = h).
when |y| < D out /2. Comparing Eqs. (S14)(S15), we obtain the spatial transmission matrix In Eq. (S16), the spatial coordinates y and y are both continuous variables, which is redundant from the information point of view. Since the transmission matrix has a bandwidth of |k y | < (ω/c)n out in the output and a bandwidth of |k y | < (ω/c) sin (FOV/2) in the input, a discrete sampling of space with spacing ∆y = λ/n out 2 in the output and spacing ∆y = λ 2 sin (FOV/2) in the input should be sufficient, which is the Nyquist sampling rate. But accurate reconstruction would then require the Whittaker-Shannon interpolation. We skip the interpolation and simply use a finer resolution ∆y = λ/4 in the output as mentioned in Sec. 2B. Since no integration is needed over the input y , we sample the input with spacing ∆y = λ 2 sin (FOV/2) . Eq. (S16) with discrete sampling can also be evaluated efficiently with FFT and inverse FFT.
Note that in order to make the transverse channels orthogonal, we imposed in Sec. 2A that k a y and k b y have spacing 2π/D in and 2π/D out , which makes E in x (y , z = 0) and E x (y, z = h) periodic in y and y with a period of D in and D out respectively. Such an artificial periodic boundary leads to periodic replications in the spatial transmission matrix, as shown in Fig. S5, which makes the lateral spreading inaccurate for y near the two edges. To remove this unphysical periodic replication while maintaining orthogonality of the basis, we choose a D in that is larger than D out (specifically, D in = 5D out ). Such choice does not affect the conclusion of this study since the angular variation of the phase shift (and subsequently ∆W max and the minimal thickness) depends on D out , not D in .

OPTIMIZATION OF GLOBAL PHASE
The function ψ(θ in ) in Eq. (11) of the main text does not affect focusing quality, so we use it as a free parameter to minimize the maximal variation of ∆φ ideal (y, θ in ) with respect to θ in , which minimizes ∆W max and the associated thickness bound. ) '(

Convex optimization
The maximal lateral spreading ∆W max as a function of the output diameter D out after optimizing ψ(θ in ).
As described in the main text, a sensible choice is where · · · y denotes averaging over y within |y| < D out /2. To assess the performance of the ψ 0 (θ in ) above, we carry out a numerical optimization where ψ(θ in ) is varied to minimize the maximal phase-shift difference between all possible pairs of incident angles across all positions: where θ i in and θ j in represent different incident angles with |θ i,j in | < FOV/2, |y| < D out /2, and ∆φ ideal (y, θ in ; This is a convex optimization problem [7] because ∆φ ideal (y, θ in ; ψ) is a linear function of ψ. We use CVX [8,9], a package for specifying and solving convex programming problems, to find the global optimum of Eq. (S18). The global optimum of ψ(θ in ) is in close agreement with the ψ 0 (θ in ) in Eq. (S17), as shown in Fig. S6(a). While there are small differences among the two, such differences have no noticeable effect on the resulting maximal lateral spreading ∆W max , as shown in Fig. S6(b). Therefore, in the following, we use ψ = ψ 0 in Eq. (S17) as the global phase.

FOV DEPENDENCE (ANIMATION)
Supplementary Video 1 shows how the ideal transmission matrix (in both bases) and the phase-shift profiles evolve as the FOV increases. Figure S7 provides the animation caption and shows one frame of the animation. Increasing the FOV leads to an increased phase-shift variation among incident angles, which widens the diagonal of the spatial transmission matrix. Figure S8 shows the middle column of the spatial transmission matrix for ideal wide-FOV lenses with different FOVs. With a small FOV, the output profiles are reasonably close to being rectangular, and the inverse participation ratio (IPR) coincides with the full width at half maximum (FWHM). With a large FOV, the output profiles develop two strong peaks, and the IPR underestimates the output width.   ig. S10. Optimized maximal lateral spreading as a function of the numerical aperture NA for different lens diameter D out and FOV.